@drbenvincent
Collaborator

@drbenvincent drbenvincent commented Dec 5, 2025

This pull request adds support for event study analysis to the CausalPy package. The main changes include introducing the new EventStudy class, updating the package exports to include it, and providing a utility function for generating synthetic panel data suitable for event study and dynamic DiD analyses.

  • This implementation does not deal with staggered treatments. Add a warning callout box, as well as data validation that raises an exception.
  • Update model description - the model being used is not static, it depends on the user provided formula. Give some examples.
  • Add an example where we remove the time fixed effects and add in other predictors or some form of seasonal trend.
  • Check the integration with the reporting layer.
  • Consider adding the time relative to treatment column in the notebook, not hidden in the experiment class.
  • Ensure we are exposing plot options via arguments
  • Simplify data plot using a 1-liner with seaborn
  • Why is the event study column not appearing in the table on home page?

⚠️ Includes a more widespread bug fix identified by bugbot here #584 (comment), which will affect multiple experiment classes.


📚 Documentation preview 📚: https://causalpy--584.org.readthedocs.build/en/584/

@codecov
codecov bot commented Dec 5, 2025

Codecov Report

❌ Patch coverage is 97.11628% with 31 lines in your changes missing coverage. Please review.
✅ Project coverage is 94.63%. Comparing base (6795c14) to head (88a3470).
⚠️ Report is 1 commits behind head on main.

Files with missing lines               Patch %   Lines
causalpy/reporting.py                  71.18%    7 Missing and 10 partials ⚠️
causalpy/tests/test_event_study.py     97.50%    0 Missing and 8 partials ⚠️
causalpy/experiments/event_study.py    97.34%    1 Missing and 5 partials ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #584      +/-   ##
==========================================
+ Coverage   93.27%   94.63%   +1.36%     
==========================================
  Files          37       40       +3     
  Lines        5632     6696    +1064     
  Branches      367      460      +93     
==========================================
+ Hits         5253     6337    +1084     
+ Misses        248      206      -42     
- Partials      131      153      +22     

The EventStudy class now requires a patsy-style formula to specify the outcome and fixed effects, removing the separate outcome_col argument. Design matrix construction uses patsy, and event-time dummies are appended. Input validation checks for formula presence, and tests and documentation are updated to reflect the new API and output format.
@drbenvincent
Collaborator Author

@cursor review

Expanded documentation to explain the patsy formula syntax, the role of unit and time fixed effects, and how event-time dummies ($\beta_k$) are automatically constructed by the EventStudy class. Added details on the event window and reference event time parameters for clearer guidance.
Added a warning in the EventStudy class and documentation that the implementation only supports simultaneous treatment timing and does not support staggered adoption. Introduced a validation to raise a DataException if treated units have different treatment times. Added a corresponding test to ensure staggered adoption raises an error, and updated the notebook to clarify estimator limitations.
Enhanced the _bayesian_plot and _ols_plot methods in EventStudy to support configurable figure size and HDI probability. Updated docstrings to document new parameters and improved plot labeling for clarity.
Introduces event study support to effect_summary(), including parallel trends check and dynamic effect reporting. Updates event study class to allow HDI probability customization and reporting, and extends documentation with effect summary usage and interpretation.
The `generate_event_study_data` function now supports optional time-varying predictors generated as AR(1) processes, controlled by new parameters: `predictor_effects`, `ar_phi`, and `ar_scale`. Added the `generate_ar1_series` utility function. Updated docstrings and examples to reflect these changes. The event study PyMC notebook was updated with additional analysis and improved section headings.
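For illustration, a minimal sketch of an AR(1) generator of the kind described above. The function name `ar1_series` and its signature are hypothetical and are not CausalPy's `generate_ar1_series`; `phi` and `scale` are assumed to play the roles of `ar_phi` and `ar_scale`.

```python
import numpy as np


def ar1_series(n, phi=0.9, scale=1.0, seed=None):
    """Simulate x[t] = phi * x[t-1] + eps[t] with eps[t] ~ Normal(0, scale).

    Hypothetical helper for illustration only.
    """
    rng = np.random.default_rng(seed)
    eps = rng.normal(0.0, scale, size=n)
    x = np.empty(n)
    x[0] = eps[0]  # start the recursion from the first innovation
    for t in range(1, n):
        x[t] = phi * x[t - 1] + eps[t]
    return x
```

With `abs(phi) < 1` the series is stationary, so a time-varying predictor built this way wanders but does not drift off indefinitely.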
Introduces integration tests for the EventStudy.effect_summary method using both PyMC and sklearn models. Tests verify the returned EffectSummary object, its table and text attributes, and key output elements.
@drbenvincent drbenvincent marked this pull request as ready for review December 6, 2025 11:48
@juanitorduz
Collaborator

I will take a look in the next few days :)

@nialloulton

@cursor review

@cursor

cursor bot commented Dec 6, 2025

PR Summary

Adds an Event Studies section with the event_study_pymc.ipynb notebook to the docs and cites Sun & Abraham (2021) in references.

  • Docs:
    • Notebooks index: Add an Event Studies toctree with event_study_pymc.ipynb in docs/source/notebooks/index.md.
    • References: Add bibliographic entry for Sun & Abraham (2021) in docs/source/references.bib.

Written by Cursor Bugbot for commit 786a5f8.

@nialloulton

bugbot run

Added Event Study to the experiment support table in reporting_statistics.md and updated AGENTS.md to instruct updating the table when adding new experiment types.
This commit adds extensive new tests to test_reporting.py and test_synthetic_data.py, covering error handling, experiment type detection, OLS statistics edge cases, prose and table generation for various models, and all synthetic data generation utilities. These tests improve coverage and robustness for reporting and data simulation functions.
Expanded documentation in both the EventStudy class and the event study PyMC notebook to explain the equivalence between indicator functions and dummy variables. Added details on how dummy variables are constructed for each event time, the omission of the reference period to avoid multicollinearity, and the interpretation of regression coefficients as ATT at each event time.
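As a rough sketch of the dummy construction described in this commit message (not the code in the `EventStudy` class): one 0/1 column per event time inside the window, with the reference event time left out to avoid multicollinearity. The function name, the column names (`time`, `treat_time`, `event_time_k`), and the defaults are assumptions made for the example.

```python
import numpy as np
import pandas as pd


def event_time_dummies(df, time_col="time", treat_time_col="treat_time",
                       event_window=(-3, 3), reference_event_time=-1):
    """Build one indicator column per event time k, omitting the reference k.

    Hypothetical helper for illustration only.
    """
    # Event time is calendar time relative to treatment. For control units
    # marked with NaN or inf this is NaN or -inf, so no dummy is switched on.
    event_time = df[time_col] - df[treat_time_col]
    lo, hi = event_window
    cols = {}
    for k in range(lo, hi + 1):
        if k == reference_event_time:
            continue  # omitted reference period
        cols[f"event_time_{k}"] = (event_time == k).astype(float)
    return pd.DataFrame(cols, index=df.index)
```

Each remaining column is the indicator $1[t - E_i = k]$ written as a dummy variable, so its regression coefficient is the $\beta_k$ read relative to the omitted reference period.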
@review-notebook-app

review-notebook-app bot commented Dec 15, 2025

juanitorduz commented on 2025-12-15T12:05:27Z
----------------------------------------------------------------

What about adding https://arxiv.org/pdf/2503.13323 as a reference?


drbenvincent commented on 2025-12-23T15:32:57Z
----------------------------------------------------------------

Done, along with some other citations

juanitorduz commented on 2025-12-15T12:05:27Z
----------------------------------------------------------------

Can we say something about the inferred parameters? (e.g., a trace plot of some parameters? I am asking because high R-hats


juanitorduz commented on 2025-12-15T12:05:28Z
----------------------------------------------------------------

again here: shall we inspect the model sampling?


juanitorduz commented on 2025-12-15T12:05:29Z
----------------------------------------------------------------

Can we also plot the real true effects here?


Collaborator

@juanitorduz juanitorduz left a comment

Very cool! This is super helpful in marketing (the effect of a marketing campaign depends on time!). We should try to find other real datasets to showcase this.

Minor comments.

Suggestion: add more tests for the individual methods and address the @BugBot comments.

).astype(float)

# Combine patsy design matrix with event-time dummies
X_df = pd.concat([X_df, event_time_dummies], axis=1)
Collaborator

@drbenvincent, if the number of units gets very large, fitting all these dummies will become hard. What about adding the option to "demean" the data instead, using https://github.com/py-econometrics/pyfixest? We can still use Bayesian methods to do inference after the demeaning procedure (just an idea).

Collaborator Author

Good idea, but a) I'm not sure we need an external package for de-meaning, and b) let's have this as an iteration on top of the first MVP (i.e. yes, but not in this PR).
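To make the de-meaning idea concrete, here is a minimal pandas-only sketch of a two-way within transformation. It is not part of this PR and not how pyfixest does it; the function name and column names are hypothetical, and the one-shot formula below is only exact for balanced panels (unbalanced panels generally need an iterative procedure).

```python
import pandas as pd


def two_way_demean(df, cols, unit_col="unit", time_col="time"):
    """Subtract unit means and time means, then add back the grand mean.

    Hypothetical helper for illustration; assumes a balanced panel.
    """
    out = df.copy()
    for c in cols:
        unit_mean = df.groupby(unit_col)[c].transform("mean")
        time_mean = df.groupby(time_col)[c].transform("mean")
        out[c] = df[c] - unit_mean - time_mean + df[c].mean()
    return out
```

Applying the transformation to the outcome and to every regressor absorbs the unit and time fixed effects, so the model no longer needs one dummy column per unit.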

Control units marked with np.inf or np.nan in the treat_time column are now correctly excluded from treated unit checks in EventStudy. Adds tests to ensure control units with np.inf, np.nan, or mixed markers are handled properly, preventing false staggered adoption errors.
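A minimal sketch of such a check, assuming a `treat_time` column in which control units carry `np.inf` or `np.nan`. The function name is hypothetical, and a plain `ValueError` stands in for CausalPy's `DataException`.

```python
import numpy as np
import pandas as pd


def check_simultaneous_treatment(df, treat_time_col="treat_time"):
    """Return the common treatment time, or None if no unit is treated.

    Raises ValueError (a stand-in for DataException) on staggered adoption.
    Hypothetical helper for illustration only.
    """
    treat_times = df[treat_time_col].to_numpy(dtype=float)
    # np.isfinite is False for both NaN and +/-inf, so control units drop out
    # before the treatment times are compared.
    treated = treat_times[np.isfinite(treat_times)]
    unique_times = np.unique(treated)
    if unique_times.size > 1:
        raise ValueError(
            f"Staggered adoption is not supported; found treatment times "
            f"{unique_times.tolist()}"
        )
    return float(unique_times[0]) if unique_times.size == 1 else None
```

Filtering on finiteness first is what prevents a control marker from being counted as a second, distinct treatment time.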
Improved rounding logic in EventStudy.get_event_time_summary for PyMC and sklearn models, ensuring consistent application of the round_to parameter. Refactored and expanded test_event_study.py to cover more edge cases, input validation, and integration scenarios, including new tests for rounding, event window handling, and control units. Cleaned up and reorganized tests for clarity and maintainability.
Updates all experiment classes to filter out rows with NaN values after patsy design matrix construction, ensuring consistent shapes between data and design matrices. Adds comprehensive tests for NaN handling across all experiment classes to verify correct filtering and prevent shape mismatch errors.
Enhanced the EventStudy class to emit warnings for unbalanced panels, gaps in time periods, and excessive data loss due to NaN filtering. Added comprehensive unit tests to cover these edge cases and verify warning behavior. Updated interrogate badge to reflect increased coverage.
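The panel checks mentioned here could look roughly like the following. This is a sketch under assumptions, not the code added to `EventStudy`: the function name, column names, and message wording are hypothetical, and time periods are assumed to be consecutive integers.

```python
import warnings

import pandas as pd


def warn_on_panel_issues(df, unit_col="unit", time_col="time"):
    """Warn about unbalanced panels and gaps in integer time periods.

    Hypothetical helper for illustration; returns the messages it emitted.
    """
    messages = []
    # Unbalanced panel: units observed for different numbers of periods.
    periods_per_unit = df.groupby(unit_col)[time_col].nunique()
    if periods_per_unit.nunique() > 1:
        messages.append("Unbalanced panel: units have differing numbers of periods.")
    # Gaps: integer periods between the first and last that never appear.
    observed = sorted(int(t) for t in df[time_col].unique())
    missing = sorted(set(range(observed[0], observed[-1] + 1)) - set(observed))
    if missing:
        messages.append(f"Gaps in time periods: missing {missing}.")
    for msg in messages:
        warnings.warn(msg, UserWarning)
    return messages
```

Warning rather than raising leaves the decision to the user, which matches the behaviour this commit message describes for these edge cases.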
Replaced direct 'pip' calls with 'python -m pip' in CONTRIBUTING.md to ensure the correct pip is used within the conda environment. Added a note explaining the reason for this change.